    Reliable Parallel Solution of Bidiagonal Systems

    This paper presents a new efficient algorithm for solving bidiagonal systems of linear equations on massively parallel machines. We use a divide-and-conquer approach to compute a representative subset of the solution components, after which we solve the complete system in parallel with no communication overhead. We address the numerical properties of the algorithm in two ways: we show how to verify the a posteriori backward stability at virtually no additional cost, and prove that the algorithm is a priori forward stable. We then show how we can use the algorithm in order to bound the possible perturbations in the solution components
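
    The divide-and-conquer idea summarized above can be illustrated with a small sketch. It is not the paper's algorithm: the abstract gives no details, so the blocking scheme, the function names (`solve_sequential`, `solve_blocked`, `max_residual`) and the lower bidiagonal layout `d[i]*x[i] + l[i]*x[i-1] = b[i]` are all assumptions made for illustration, and the per-block loops that could run in parallel are written sequentially here.

```python
from typing import List, Tuple


def solve_sequential(d: List[float], l: List[float], b: List[float]) -> List[float]:
    """Plain forward substitution for d[i]*x[i] + l[i]*x[i-1] = b[i] (l[0] is ignored)."""
    x = [0.0] * len(d)
    prev = 0.0
    for i in range(len(d)):
        coupling = l[i] if i > 0 else 0.0
        prev = (b[i] - coupling * prev) / d[i]
        x[i] = prev
    return x


def solve_blocked(d: List[float], l: List[float], b: List[float],
                  block_size: int) -> List[float]:
    """Illustrative three-phase solve; an assumed scheme, not the paper's method."""
    n = len(d)
    starts = list(range(0, n, block_size))

    # Phase 1 (independent per block): write the last component of each block
    # as an affine function p + q * x_in of the component just before the block.
    affine: List[Tuple[float, float]] = []
    for s in starts:
        p, q = 0.0, 1.0
        for i in range(s, min(s + block_size, n)):
            coupling = l[i] if i > 0 else 0.0
            p, q = (b[i] - coupling * p) / d[i], -coupling * q / d[i]
        affine.append((p, q))

    # Phase 2 (short sequential sweep): compute the value entering each block,
    # i.e. a small "representative subset" of the solution components.
    inputs: List[float] = []
    x_in = 0.0
    for p, q in affine:
        inputs.append(x_in)
        x_in = p + q * x_in

    # Phase 3 (independent per block): forward substitution inside each block,
    # starting from the known incoming value; no further exchange is needed.
    x = [0.0] * n
    for s, incoming in zip(starts, inputs):
        prev = incoming
        for i in range(s, min(s + block_size, n)):
            coupling = l[i] if i > 0 else 0.0
            prev = (b[i] - coupling * prev) / d[i]
            x[i] = prev
    return x


def max_residual(d: List[float], l: List[float], b: List[float],
                 x: List[float]) -> float:
    """Largest absolute residual of the system, a cheap a posteriori sanity check."""
    return max(
        abs(d[i] * x[i] + (l[i] * x[i - 1] if i > 0 else 0.0) - b[i])
        for i in range(len(d))
    )
```

    Calling `solve_blocked(d, l, b, block_size)` should agree with `solve_sequential(d, l, b)` up to rounding. The residual check is only a simple sanity test and does not reproduce the stability analysis described in the abstract.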

    From techno-scientific grammar to organizational syntax. New production insights on the nature of the firm

    The paper aims at providing the conceptual building blocks of a theory of the firm which addresses its "ontological questions" (existence, boundaries and organization) by placing production at its core. We draw on engineering for a more accurate description of the production process itself, highlighting its inner complexity and potentially chaotic nature, and on computational linguistics for a production-based account of the nature of economic agents and of the mechanisms through which they build ordered production sets. In so doing, we give a "more appropriate" production basis to the crucial issues of how a firm's boundaries are set, how its organisational structure is defined, and how it changes over time. In particular, we show how economic agents select some tasks to be performed internally, while leaving others to external suppliers, on the basis of criteria that depend both on the different degrees of internal congruence of the tasks to be performed (i.e. the internal environment) and on the outer relationships carried out with other agents (i.e. the external environment)

    Analysis of a Wireless Sensors Dropping Problem in Environmental Monitoring

    In this paper we study the following problem: we are given a certain region R to monitor and a requirement on the degree of coverage (DoC) of R to be met by a network of deployed sensors. The latter will be dropped by a moving vehicle, which can release sensors at arbitrary points within R. The node spatial distribution when sensors are dropped at a certain point is modeled by a certain probability density function F. The network designer is allowed to choose an arbitrary set of drop points, and to release an arbitrary number of sensors at each point. Given this setting, we consider the problem of determining the optimal deployment strategy, i.e., the drop strategy such that the DoC requirement is fulfilled and the total number of deployed nodes n is minimum. We study this problem both analytically and through simulation, under the assumption that F is the two-dimensional Normal distribution of parameter s (sigma) centered at the drop point. We show that, for a given value of s (sigma) and DoC requirement, optimal deployment strategies can be easily identified. The sensor dropping problem studied in this paper is relevant whenever manual node deployment is impossible or overly expensive, and partially controlled deployment (the network designer can choose the drop points, but the final node deployment is random) is the only feasible choice
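
    A minimal simulation sketch of the setting described above follows. Several details are assumptions that the abstract does not state: the region R is taken to be a square of side `side`, each sensor is assumed to cover a disc of radius `radius`, and the degree of coverage is approximated as the fraction of R within range of at least one sensor. The function names `scatter` and `estimate_coverage` are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]


def scatter(drops: List[Tuple[Point, int]], sigma: float,
            rng: random.Random) -> List[Point]:
    """Land `count` sensors around each drop point with a 2D Normal spread.

    Each coordinate is drawn independently with standard deviation sigma,
    centred at the drop point, as in the abstract's model of F.
    """
    sensors: List[Point] = []
    for (cx, cy), count in drops:
        for _ in range(count):
            sensors.append((rng.gauss(cx, sigma), rng.gauss(cy, sigma)))
    return sensors


def estimate_coverage(sensors: List[Point], side: float, radius: float,
                      rng: random.Random, samples: int = 5000) -> float:
    """Monte Carlo estimate of the fraction of the square [0, side]^2 that lies
    within `radius` of at least one sensor (an assumed definition of coverage)."""
    r2 = radius * radius
    covered = 0
    for _ in range(samples):
        px, py = rng.uniform(0.0, side), rng.uniform(0.0, side)
        if any((px - sx) ** 2 + (py - sy) ** 2 <= r2 for sx, sy in sensors):
            covered += 1
    return covered / samples
```

    Comparing `estimate_coverage` across candidate drop strategies with the same total number of sensors gives a rough feel for the trade-off the paper studies; it is not a substitute for the analytical results mentioned in the abstract.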

    Investigating Power and Limitations of Ensemble Motif Finders Using Metapredictor CE3

    Ensemble methods represent a relatively new approach to motif discovery that combines the results returned by "third-party" finders with the aim of achieving a better accuracy than that obtained by the single tools. Besides the choice of the external finders, another crucial element for the success of an ensemble method is the particular strategy adopted to combine the finders' results, a.k.a. the learning function. Results that have appeared in the literature seem to suggest that ensemble methods can provide noticeable improvements over the quality of the most popular tools available for motif discovery. With the goal of better understanding the potential and limitations of ensemble methods, we developed a general software architecture whose major feature is its flexibility with respect to the crucial aspects of ensemble methods mentioned above. The architecture provides facilities for the easy addition of virtually any third-party tool for motif discovery whose code is publicly available, and for the definition of new learning functions. We present a prototype implementation of our architecture, called CE3 (Customizable and Easily Extensible Ensemble). Using CE3, and available ensemble methods, we performed experiments with three well-known datasets. The results presented here are varied. On the one hand, they confirm that ensemble methods cannot simply be considered the universal remedy for "in-silico" motif discovery. On the other hand, we found some encouraging regularities that may help to find a general setup for CE3 (and other ensemble methods as well) able to guarantee substantial improvements over single finders in a systematic way

    CNVScan: detecting borderline copy number variations in NGS data via scan statistics

    Background. Next Generation Sequencing (NGS) data has been extensively exploited in the last decade to analyse genome variations and to understand the role of genome variations in complex diseases. Copy number variations (CNVs) are genomic structural variants estimated to account for about 1.2% of the total variation in humans. CNVs in coding or regulatory regions may have an impact on gene expression, often also at a functional level, and contribute to causing different diseases such as cancer, autism and cardiovascular diseases. Computational methods developed for the detection of CNVs from NGS data and based on the depth of coverage are limited to the identification of medium/large events and are heavily influenced by the level of coverage. Result. In this paper we propose CNVScan, a CNV detection method based on scan statistics that overcomes the limitations of previous read count (RC) based approaches mainly by being a window-less approach. Scan statistics have been used before mainly in epidemiology and ecology studies but, to the best of our knowledge, have never been applied to the CNV detection problem. Since we avoid windowing, we do not have to choose an optimal window size, which is a key step in many previous approaches. Extensive simulated experiments with single read data in extreme situations (low coverage, short reads, homo/heterozygoticity) show that this approach is very effective for a range of small CNVs (200-500 bp) for which previous state-of-the-art methods are not suitable. Conclusion. The scan statistics technique is applied and adapted in this paper for the first time to the CNV detection problem. Comparison with state-of-the-art methods shows that the approach is quite effective in discovering short CNVs in rather extreme situations in which previous methods fail or have degraded performance. CNVScan thus extends the range of CNV sizes and types that can be detected via read count with single read data

    Towards User-Aware Service Composition

    Our everyday life is more and more supported by information technology in general and by specific services provided by means of our electronic devices. The AMBIT project (Algorithms and Models for Building context-dependent Information delivery Tools) aims to support the development of services that are automatically tailored based on the user profile. However, while the adaptation of the single services is the first step, the next step is to achieve adaptation in the composition of different services. In this paper, we explore how services can be composed in a user-aware way, in order to decide the composition that best meets users’ requirements. That is, we exploit the user profile not only to provide her customized services, but also to compose them in a suitable way

    CE3: Customizable and Easily Extensible Ensemble Tool for Motif Discovery

    Ensemble methods (or simply ensembles) for motif discovery represent a relatively new approach to improve the accuracy of stand-alone motif finders. The performance of an ensemble is clearly determined by the included finders as well as the strategy to combine the results returned by the latter (the so-called learning rule). A potential obstacle to a widespread adoption of ensembles is that the choice of the particular finders included is closed. Although possible in principle, the addition to an ensemble of a new "promising" tool requires knowledge of the internals of the ensemble and usually nontrivial programming skills. In this research we propose a general architecture for ensembles and a prototype called CE3: Customizable and Easily Extensible Ensemble, which is meant to be extensible and customizable at the level of its two key component modules, namely the external finding tools and the learning rule. In this way the user will be able to essentially "simulate" any existing ensemble, create his/her own ensemble according to his/her preferences on finding tools and learning functions and, finally, keep it up to date when new tools and new ideas for learning functions are proposed in the literature. These features also make CE3 a suitable tool to perform experiments that may lead to a proper configuration of ensembles in the research of novel motifs
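
    The two pluggable components named above (finders and learning rule) can be pictured with a toy sketch. Nothing here reflects CE3's actual code or interfaces, which the abstract does not describe: the types `Finder` and `LearningRule`, the toy `exact_word_finder`, and the two rules `majority_vote` and `union_rule` are hypothetical stand-ins chosen only to show how finders and combination strategies could be swapped independently.

```python
from collections import Counter
from typing import Callable, Dict, List, Set, Tuple

Site = Tuple[str, int]                          # (sequence id, start position)
Finder = Callable[[Dict[str, str]], Set[Site]]  # sequences -> predicted sites
LearningRule = Callable[[List[Set[Site]]], Set[Site]]


def exact_word_finder(word: str) -> Finder:
    """Toy stand-in for an external motif finder: reports exact matches of `word`."""
    def finder(sequences: Dict[str, str]) -> Set[Site]:
        return {
            (seq_id, i)
            for seq_id, seq in sequences.items()
            for i in range(len(seq) - len(word) + 1)
            if seq.startswith(word, i)
        }
    return finder


def majority_vote(predictions: List[Set[Site]]) -> Set[Site]:
    """Keep the sites reported by more than half of the finders."""
    votes = Counter(site for pred in predictions for site in pred)
    needed = len(predictions) // 2 + 1
    return {site for site, count in votes.items() if count >= needed}


def union_rule(predictions: List[Set[Site]]) -> Set[Site]:
    """Keep every site reported by at least one finder."""
    return set().union(*predictions)


def run_ensemble(sequences: Dict[str, str], finders: List[Finder],
                 rule: LearningRule) -> Set[Site]:
    """Run every finder on the same input and combine the outputs with `rule`."""
    return rule([finder(sequences) for finder in finders])
```

    Swapping `majority_vote` for `union_rule`, or appending another finder to the list, changes the ensemble without touching `run_ensemble`, which is the kind of extensibility the abstract argues for.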

    Direct vs 2-stage approaches to structured motif finding

    BACKGROUND: The notion of DNA motif is a mathematical abstraction used to model regions of the DNA (known as Transcription Factor Binding Sites, or TFBSs) that are bound by a given Transcription Factor to regulate gene expression or repression. In turn, DNA structured motifs are a mathematical counterpart that models sets of TFBSs that work in concert in the gene regulation processes of higher eukaryotic organisms. Typically, a structured motif is composed of an ordered set of isolated (or simple) motifs, separated by a variable, but somewhat constrained, number of “irrelevant” base-pairs. Discovering structured motifs in a set of DNA sequences is a computationally hard problem that has been addressed by a number of authors using either a direct approach, or via the preliminary identification and successive combination of simple motifs. RESULTS: We describe a computational tool, named SISMA, for the de-novo discovery of structured motifs in a set of DNA sequences. SISMA is an exact, enumerative algorithm, meaning that it finds all the motifs conforming to the specifications. It does so in two stages: first it discovers all the possible component simple motifs, then it combines them in a way that respects the given constraints. We developed SISMA mainly with the aim of understanding the potential benefits of such a 2-stage approach w.r.t. direct methods. In fact, no 2-stage software was available for the general problem of structured motif discovery, but only a few tools that solved restricted versions of the problem. We evaluated SISMA against other published tools on a comprehensive benchmark made of both synthetic and real biological datasets. In a significant number of cases, SISMA outperformed the competitors, exhibiting a good performance also in most of the cases in which it was inferior. CONCLUSIONS: A reflection on the results obtained led us to conclude that a 2-stage approach can be implemented with many advantages over direct approaches. Some of these have to do with greater modularity, ease of parallelization, and the possibility to perform adaptive searches of structured motifs. As another consideration, we noted that most hard instances for SISMA were easy to detect in advance. In these cases one may initially opt for a direct method; or, as a viable alternative in most laboratories, one could run both direct and 2-stage tools in parallel, halting the computations when the first halts
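
    The 2-stage structure described above can be sketched on a toy problem. This is not SISMA: stage 1 here only locates exact occurrences of given words in a single sequence, whereas the abstract describes de-novo discovery of simple motifs across a set of sequences. The function names and the gap convention (number of bases between the end of one component and the start of the next) are assumptions made for illustration.

```python
from typing import List, Tuple


def occurrences(seq: str, word: str) -> List[int]:
    """Stage 1 (simplified): start positions of exact matches of `word` in `seq`."""
    return [i for i in range(len(seq) - len(word) + 1) if seq.startswith(word, i)]


def combine(seq: str, words: List[str],
            gaps: List[Tuple[int, int]]) -> List[List[int]]:
    """Stage 2: chain the component occurrences in order.

    gaps[k] = (min_gap, max_gap) bounds the number of bases allowed between
    the end of words[k] and the start of words[k + 1]. Returns one list of
    start positions per structured occurrence found.
    """
    per_word = [occurrences(seq, w) for w in words]
    chains = [[p] for p in per_word[0]]
    for k in range(1, len(words)):
        lo, hi = gaps[k - 1]
        extended = []
        for chain in chains:
            end_prev = chain[-1] + len(words[k - 1])
            for p in per_word[k]:
                if lo <= p - end_prev <= hi:
                    extended.append(chain + [p])
        chains = extended
    return chains
```

    Because stage 1 runs separately for each component, its results can be computed independently and reused with different gap constraints, which loosely mirrors the modularity and parallelization benefits the abstract attributes to 2-stage approaches.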

    CMStalker: a combinatorial tool for composite motif discovery

    Controlling the differential expression of many thousands of different genes at any given time is a fundamental task of metazoan organisms, and this complex orchestration is controlled by the so-called regulatory genome encoding complex regulatory networks: several Transcription Factors bind to precise DNA regions, so as to perform in a cooperative manner a specific regulation task for nearby genes. The in silico prediction of these binding sites is still an open problem, notwithstanding continuous progress and activity in the last two decades. In this paper we describe a new efficient combinatorial approach to the problem of detecting sets of cooperating binding sites in promoter sequences, given in input a database of Transcription Factor Binding Sites encoded as Position Weight Matrices. We present CMStalker, a software tool for composite motif discovery which embodies a new approach that combines a constraint satisfaction formulation with a parameter relaxation technique to explore efficiently the space of possible solutions. Extensive experiments with twelve data sets and eleven state-of-the-art tools are reported, showing an average value of the correlation coefficient of 0.54 (against a value of 0.41 for the closest competitor). This improvement in output quality due to CMStalker is statistically significant
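
    Since the input described above is a set of binding sites encoded as Position Weight Matrices, a short sketch of how a single matrix is typically matched against a sequence may help. It shows only a standard log-odds scan, not CMStalker's constraint satisfaction or parameter relaxation steps. The matrix representation (one base-to-probability dictionary per column), the uniform background of 0.25 and the function names are assumptions.

```python
import math
from typing import Dict, List, Tuple

Column = Dict[str, float]  # probability of each base at one motif position


def score_window(pwm: List[Column], window: str, background: float = 0.25) -> float:
    """Log-odds score of `window` under the matrix versus a uniform background.

    Probabilities must be strictly positive (e.g. after adding pseudocounts),
    otherwise the logarithm is undefined.
    """
    return sum(math.log2(col[base] / background) for col, base in zip(pwm, window))


def scan(pwm: List[Column], seq: str, threshold: float) -> List[Tuple[int, float]]:
    """Return (start, score) for every window scoring at least `threshold`."""
    width = len(pwm)
    hits = []
    for i in range(len(seq) - width + 1):
        s = score_window(pwm, seq[i:i + width])
        if s >= threshold:
            hits.append((i, s))
    return hits


def consensus_pwm(word: str, strength: float = 0.85) -> List[Column]:
    """Hypothetical helper: a matrix favouring `word`, used only for the demo."""
    rest = (1.0 - strength) / 3.0
    return [{b: (strength if b == c else rest) for b in "ACGT"} for c in word]
```

    For example, `scan(consensus_pwm("TATA"), "GGTATAGG", 5.0)` reports a single hit starting at position 2. A composite motif tool would then look for combinations of such hits from different matrices that occur close together.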